Adversarial Machine Learning at Scale
نویسندگان
چکیده
Adversarial examples are malicious inputs designed to fool machine learning models. They often transfer from one model to another, allowing attackers to mount black box attacks without knowledge of the target model’s parameters. Adversarial training is the process of explicitly training a model on adversarial examples, in order to make it more robust to attack or to reduce its test error on clean inputs. So far, adversarial training has primarily been applied to small problems. In this research, we apply adversarial training to ImageNet (Russakovsky et al., 2014). Our contributions include: (1) recommendations for how to succesfully scale adversarial training to large models and datasets, (2) the observation that adversarial training confers robustness to single-step attack methods, (3) the finding that multi-step attack methods are somewhat less transferable than singlestep attack methods, so single-step attacks are the best for mounting black-box attacks, and (4) resolution of a “label leaking” effect that causes adversarially trained models to perform better on adversarial examples than on clean examples, because the adversarial example construction process uses the true label and the model can learn to exploit regularities in the construction process.
منابع مشابه
Adversarial and Secure Machine Learning
The advance of machine learning has enabled establishments of many automatic systems, leveraging its outstanding predictive power. From face recognition to recommendation systems and to social network relationship mining, machine learning found its rising attention from both researchers and practitioners in many different domains. Data-driven technologies based on machine learning facilitate th...
متن کاملWild Patterns: Ten Years After the Rise of Adversarial Machine Learning
Learning-based pattern classifiers, including deep networks, have demonstrated impressive performance in several application domains, ranging from computer vision to computer security. However, it has also been shown that adversarial input perturbations carefully crafted either at training or at test time can easily subvert their predictions. The vulnerability of machine learning to adversarial...
متن کاملOn the Connection between Differential Privacy and Adversarial Robustness in Machine Learning
Adversarial examples in machine learning has been a topic of intense research interest, with attacks and defenses being developed in a tight back-and-forth. Most past defenses are best-effort, heuristic approaches that have all been shown to be vulnerable to sophisticated attacks. More recently, rigorous defenses that provide formal guarantees have emerged, but are hard to scale or generalize. ...
متن کاملAdversarial examples in the physical world
Most existing machine learning classifiers are highly vulnerable to adversarial examples. An adversarial example is a sample of input data which has been modified very slightly in a way that is intended to cause a machine learning classifier to misclassify it. In many cases, these modifications can be so subtle that a human observer does not even notice the modification at all, yet the classifi...
متن کاملAdversarial learning: the impact of statistical sample selection techniques on neural ensembles
Adversarial learning is a recently introduced term which refers to the machine learning process in the presence of an adversary whose main goal is to cause dysfunction to the learning machine. The key problem in adversarial learning is to determine when and how an adversary will launch its attacks. It is important to equip the deployed machine learning system with an appropriate defence strateg...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1611.01236 شماره
صفحات -
تاریخ انتشار 2016